Abstract: We prove that in order to communicate independent sources between various users over an
unknown medium to within various distortion levels, it is sufficient to consider source-channel separation 
based architectures: architectures which first compress the sources to within the corresponding distortion 
levels followed by reliable communication over the unknown medium. 

1. INTRODUCTION

Architecture, defined as organization of distributed algorithms 
in software and hardware, plays a fundamental role in commu- 
nications, control and computer science. The Von Neumann ar- 
chitecture of a stored program computer still today provides the 
model of computation. The separation theorem for source and 
channel coding in Shannon's theory of information provides an 
architecture for point-to-point communication. 

Consider a controlled finite-state Markoff process
$(X_t(u(\cdot)))_{t \ge 0}$, where $X_t$ is the state of the Markoff
process at time $t$. The control at time $t$ is $u_t$. Let $Z_t$
represent a "partial" observation of the state at time $t$. It is
required to choose the control $u_t$ at time $t$ based on the past
observations $(Z_s, 0 \le s \le t)$ in order to minimize the expected
cost $J(u(\cdot)) = E \int_0^T c(X_t, u_t)\,dt$.
An important theorem states that the control separates into an
estimation part, namely, computing the conditional distribution
$\pi_t(X_t \mid Z_s, 0 \le s \le t)$, followed by a control part,
namely, computing the optimal control $u_t$ by minimizing $J$
considered as a function of the "information state" $\pi_t$. Again,
this leads to an architecture where the controller separates into an
estimator and a controller. These are all examples of "layered"
architectures.

In this paper, we consider the question: how does one optimally
communicate various sources with a fidelity criterion, that is,
to within particular distortion levels, over a common, unknown
medium? This question arises in various
contexts. A classic example is wireless: various users need 
to communicate via voice with each other over the unknown 
wireless medium and voice admits distortion. 

We answer this question under the following 3 assumptions: 

• Distortion measures are additive 

• Sources that need to be communicated between various
users are independent of each other. More precisely, for
$(i, j) \ne (i', j')$, the source that needs to be communicated
from user $i$ to user $j$ is independent of the source that
needs to be communicated from user $i'$ to user $j'$. That
is, the setting is unicast.

• There is a shared source of randomness, or common randomness,
at the various users. Thus, random coding is permitted.

We prove that digital communication is optimal for this problem.
Saying that digital communication is optimal is the same as saying
that source-channel separation based architectures, that is,
architectures where each user first compresses the source to
within the corresponding distortion level, followed by universal
reliable communication of the resulting compressed source
over the unknown medium, are optimal. Optimality is meant in
the sense that if some architecture exists to accomplish this
communication, then a separation based architecture exists too.
Digital communication need not be optimal if there are other
requirements (for example, some kind of robustness) in addition to
the communication of the sources to within the required distortion
levels.

The source-channel separation theorem that we prove is universal
and holds for networks. Universality is over the medium of
communication and not the source. By universality, we mean
that we do not need to know the exact operation of the medium:
the medium is unknown. In information-theoretic terms, we mean
that we do not know the precise operation of the network as a
transition probability.

We do not provide any answers for the problem of reliable com- 
munication of bits over a network. This is the classical problem 
of network information theory. Our view is a reductionist view. 
We reduce the problem of rate-distortion communication over 
networks to the classical network information theory problem 
of reliable communication of bits by showing the optimality 
of digital communication/source-channel separation architec- 
tures. 

Section 2 discusses the previous work on and related to this
problem. Section 3 discusses the system model. The view that
we will take to solve the problem described above is discussed
in Section 4. This view is described behaviorally in Section
5. Section 6 defines various forms of communication. Section
7 states and proves some theorems which will be helpful in
proving our main result on universal source-channel separation
for rate-distortion communication in networks in Section 8. In
Section 9, we discuss our results with examples and conclude.

2. PREVIOUS WORK 

Shannon (1959) proved that digital communication is optimal 
for communication with a fidelity criterion in the point-to-point 
case. We differ in that we have solved the network version of 
the problem. Also, Shannon (1959) does not solve the universal 
problem: the channel needs to be known. The universal point- 
to-point rate-distortion communication problem was solved by 
us in Agarwal et al. (2006). Furthermore, Shannon (1959) 
requires some ergodicity assumptions on the channel whereas 
we do not require any ergodicity assumptions on the channel. 
We use a probability of excess distortion definition (3) for 
distortion over blocks compared to the expected distortion 
definition (5) used in Shannon (1959). This change of definition 
allows us to prove universal results for general, not necessarily 
ergodic channels. 

In his thesis, Gastpar (2002) proves optimality of separation 
architectures for certain networks, for example, when indepen- 
dent sources need to be communicated over a multiple access 
channel. Our work differs because we prove separation for 
general networks in the unicast setting, and not just in particular 
examples. We also prove separation in the universal context,
unlike Gastpar (2002). As in the comparison with Shannon (1959),
universality is possible because we use a different definition
of distortion over blocks. Gastpar (2002) also contains
examples where correlated sources need to be communicated
over a network to within particular distortion levels. By two
simple examples, it is shown in Gastpar (2002) that separation
architectures might not be optimal in this scenario. The two
examples are:

(1) Communication of correlated sources over a multiple- 
access channel 

(2) Communication of the same source to within different
distortion levels over a broadcast channel. In this scenario,
it is proved in Gastpar (2002) that uncoded transmission
can, in general, perform better than separation based
communication. Note that the communication of the same
source to two different users belongs to the multi-cast
setting: the situation can be thought of as one in which two
different sources, which are in fact identical and hence not
independent, need to be communicated from a user to two
other users.

These examples show that in general, the unicast condition is 
necessary for separation architectures to be optimal. Our results 
and the results in Tian et al. (2010) which we discuss in brief 
below, show that the independence assumption is sufficient. 

Tian, Chen, Diggavi and Shamai prove various results concern- 
ing optimality and approximate optimality of source-channel 
separation for rate-distortion in networks in Tian et al. (2010). 
The result which has intersection with our result is where 
they prove optimality of separation based architectures when 
sources are independent of each other, over general networks. 
Results in Tian et al. (2010) are not universal. Results in 
Tian et al. (2010) require that the network have finite memory 
whereas we do not. As we stated above when comparing our
work with Shannon (1959), these differences are made possible
because we use a different definition of distortion. Tian et al.
(2010) also contains interesting results for approximate opti- 
mality of separation architectures in the multi-cast setting: as 
the examples in Gastpar (2002) show, in general, one cannot 
hope for optimality of separation architectures in this setting. 

Separation is discussed in the network-coding literature in the
sense of separation of channel coding and network coding; see,
for example, Koetter et al. (2009). However, whenever we
mention separation, we mean the separation of source
and channel coding.

3. MODEL OF THE GIVEN SYSTEM 

There are various users. The users communicate sources among
each other. As shown in Figure 1, the system consists of
"architecture boxes" interconnected to a medium. The architecture
boxes, which will be referred to as modulators-demodulators or
modems, can be thought of as the system protocol and aid
communication.

Fig. 1. System model: $X_{ij}(t)$ transmitted at user $i$ is received as $Y_{ij}(t)$ at user $j$.

More concretely: 

There are $N$ users. $N$ might change with time. For $i \ne j$, user
$i$ communicates source $X_{ij}(\cdot)$ to user $j$ over the system. The
reproduction of $X_{ij}(\cdot)$ at user $j$ is $Y_{ij}(\cdot)$.
$\forall t$, $X_{ij}(t) \in \mathcal{X}_{ij}$.

Note: Above, when we mention $X_{ij}(\cdot)$, we mean the whole
trajectory taken by the process over time $-\infty < t < \infty$. When
we mention $X_{ij}(t)$, we mean the value at time $t$.

Note the ordering of $i$ and $j$ in $Y_{ij}(\cdot)$.

$m$ denotes the medium; $h_i$, $1 \le i \le N$, is the modem at user $i$.

Modem $h_i$ at user $i$ takes source inputs $X_{i1}(\cdot), X_{i2}(\cdot),
\ldots, X_{ij}(\cdot), \ldots, X_{iN}(\cdot)$. $h_i$ takes input
$O_i(\cdot)$ from the medium $m$. Modem $h_i$ produces an output
$I_i(\cdot)$ into the medium $m$. In wireless systems, $I_i(\cdot)$ and
$O_i(\cdot)$ are electromagnetic waves. Modem $h_i$ produces output
source reproductions $Y_{1i}(\cdot), Y_{2i}(\cdot), \ldots, Y_{ji}(\cdot),
\ldots, Y_{Ni}(\cdot)$. $I_i(\cdot)$ is an input to the medium $m$ but
an output of the modem $h_i$. $O_i(\cdot)$ is an output of the medium
$m$ but an input to the modem $h_i$.

The modems are also assumed to have a common source of
randomness denoted by $C$. The input $C$ is the same for all
modems and can be used by the modems to generate random
codes.

The medium takes inputs $I_1(\cdot), I_2(\cdot), \ldots, I_N(\cdot)$ and
produces outputs $O_1(\cdot), O_2(\cdot), \ldots, O_N(\cdot)$.

The modem $h_i$ encodes information into the input $I_i(\cdot)$.
$I_i(\cdot)$ contains information about

(1) Sources $X_{ij}(\cdot)$, $1 \le j \le N$, that user $i$ wants to
communicate to other users.

(2) Sources $X_{i'j'}(\cdot)$, $i' \ne i$. Modem $h_i$ has knowledge of
other sources $X_{i'j'}(\cdot)$, which are not inputs at user $i$,
through the medium output $O_i(\cdot)$. In this case, information
about $X_{i'j'}(\cdot)$ is being relayed through user $i$.

Particular realizations of the random source processes and their
reproductions, and of the inputs and outputs of the medium, will be
denoted by $x_{ij}(\cdot)$, $y_{ij}(\cdot)$, $\iota_i(\cdot)$,
$o_i(\cdot)$. To avoid mathematical technicalities, it is assumed that
the system evolves in discrete time, say, at every integer time. For
the same reason, it is assumed that the source alphabet, the source
reproduction alphabet and the medium input and output alphabets are
finite.

Mathematically, the modem $h_i$ is a transition probability

$$h_{i,\tau}\big(y_{ji}(\tau), 1 \le j \le N, \iota_i(\tau) \,\big|\, x_{ij}(-\infty..\tau-1), 1 \le j \le N, o_i(-\infty..\tau-1), c, y_{ji}(-\infty..\tau-1), 1 \le j \le N, \iota_i(-\infty..\tau-1)\big) \tag{1}$$

denoting the probability that the

(1) source reproduction outputs of modem $i$ at time $\tau$ are
$y_{ji}(\tau)$, $1 \le j \le N$,

(2) output produced by $h_i$ into the medium at time $\tau$ is $\iota_i(\tau)$

given that

(1) past source inputs are $x_{ij}(t)$, $-\infty < t \le \tau - 1$, $1 \le j \le N$,

(2) past input from the medium is $o_i(t)$, $-\infty < t \le \tau - 1$,

(3) common randomness input is $c$,

(4) past source reproduction outputs are $y_{ji}(t)$, $1 \le j \le N$,
$-\infty < t \le \tau - 1$,

(5) past output into the medium is $\iota_i(t)$, $-\infty < t \le \tau - 1$.

Mathematically, the medium is a transition probability

$$m_{\tau}\big(o_i(\tau), 1 \le i \le N \,\big|\, \iota_i(-\infty..\tau-1), 1 \le i \le N, o_i(-\infty..\tau-1), 1 \le i \le N, s\big) \tag{2}$$

denoting the probability that the medium outputs at time $\tau$ are
$o_i(\tau)$, $1 \le i \le N$, given that

(1) past inputs into the medium were $\iota_i(t)$, $-\infty < t \le \tau - 1$,
$1 \le i \le N$,

(2) past outputs produced by the medium were $o_i(t)$, $-\infty < t \le \tau - 1$,
$1 \le i \le N$,

(3) and that the initial medium state was $s$.

The behavior of the medium m may be complex. The interac- 
tion of medium m and the modems hi and the resulting flow of 
information may be complex. The users may be co-operating. 
There may be multi-hopping and feedback. 

The sources $X_{ij}(\cdot)$ should be thought of as primitive in the
sense that system behavior, that is, the behavior of the modems
$h_i$ and the medium $m$, does not affect the sources. This is a
causality assumption.



The source waveform $X_{ij}(\cdot)$ is reproduced at a later time.
$Y_{ij}(t_m)$ is the reproduction of $X_{ij}(m)$ for some $t_m > m$. We
define the process $Y_{ij}[m]$ for integer $m$, denoted with square
brackets, by $Y_{ij}[m] = Y_{ij}(t_m)$. We also define $X_{ij}[m] =
X_{ij}(m)$. In this notation, $Y_{ij}[m]$ is the reproduction of $X_{ij}[m]$.
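
A toy discrete-time simulation may help fix the roles of the modems, the medium, and the delayed reproduction. Everything in this sketch is an assumption made for illustration and is not taken from the model above: two users, binary alphabets, a memoryless medium that hands each user the other user's previous input (possibly flipped), and modems that simply forward their last source symbol. The names `medium_step` and `simulate` are hypothetical.

```python
import random

def medium_step(prev_inputs, flip_prob, rng):
    """Toy medium kernel: each user's output is the other user's previous
    input, flipped with probability flip_prob (an assumed noise model)."""
    i1, i2 = prev_inputs
    o1 = i2 ^ int(rng.random() < flip_prob)
    o2 = i1 ^ int(rng.random() < flip_prob)
    return o1, o2

def simulate(T, flip_prob, seed=0):
    """Run two toy modems and the toy medium for T integer time steps."""
    rng = random.Random(seed)
    x12 = [rng.randint(0, 1) for _ in range(T)]   # source from user 1 to user 2
    x21 = [rng.randint(0, 1) for _ in range(T)]   # source from user 2 to user 1
    iota = [(0, 0)] * T                           # medium inputs (I_1, I_2)
    o = [(0, 0)] * T                              # medium outputs (O_1, O_2)
    y12 = [0] * T                                 # reproduction of x12 at user 2
    for t in range(1, T):
        # Modems act on past source symbols only (causality).
        iota[t] = (x12[t - 1], x21[t - 1])
        # The medium acts on past inputs only.
        o[t] = medium_step(iota[t - 1], flip_prob, rng)
        # User 2 reproduces from its past medium output O_2.
        y12[t] = o[t - 1][1]
    return x12, y12

x12, y12 = simulate(T=50, flip_prob=0.0)
delay = 3  # in this toy system the reproduction time is t_m = m + 3
noiseless_match = all(y12[t] == x12[t - delay] for t in range(delay, len(x12)))
```

With no flips, the reproduction at user 2 is the source delayed by three steps, which is the square-bracket re-indexing in miniature: $Y_{12}[m] = Y_{12}(m + 3)$ reproduces $X_{12}[m]$.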

4. SPIRIT OF THE QUESTION: HIGH LEVEL 

We ask a question in the following spirit. 

Suppose we are given a system as above, that is, a system which is
known to communicate random sources $X_{ij}(\cdot)$ from user $i$ to
user $j$, $1 \le i, j \le N$, over a medium. See Figure 2.

Let $s$ and $r$ be two particular users. It is known that source
$X_{sr}(\cdot)$ is communicated from user $s$ to user $r$ over the system
with some guarantee. Denote the guarantee by $G$. $X_{sr}(\cdot)$ is
received as $Y_{sr}(\cdot)$. An example of a guarantee, and the one
we will use, is that $X_{sr}(\cdot)$ is communicated to within some
distortion level.

We ask a question about the communication of another random
source $X'_{sr}(\cdot)$, evolving in time, in place of source
$X_{sr}(\cdot)$ from user $s$ to user $r$. The source $X'_{sr}(\cdot)$
should be received with some guarantee $G'$ depending on $G$. The
guarantee $G'$ that we will use is that $X'_{sr}(\cdot)$ needs to be
communicated to the destination to within some distortion level.

We will assume that the sources $X_{ij}(\cdot)$ are independent of each
other $\forall i, j$. This assumption is crucial.

We will also assume that the source $X'_{sr}(\cdot)$ is independent of
the sources $X_{ij}(\cdot)$ $\forall i, j$. In order to prove the result
concerning optimality of digital communication as stated in Section 1,
it is okay to make this assumption. $X'_{sr}(\cdot)$ is primitive in the
sense discussed in Section 3.

The changes made in the system for the desired communication
of $X'_{sr}(\cdot)$ from user $s$ to user $r$ should not change the
communication of $X_{ij}(\cdot)$ from user $i$ to user $j$ for
$(i, j) \ne (s, r)$. Mathematically, this means that $X_{ij}(\cdot)$
should be received precisely as $Y_{ij}(\cdot)$ in distribution for
$(i, j) \ne (s, r)$. Of course, instead of $X_{sr}(\cdot)$,
$X'_{sr}(\cdot)$ now needs to be communicated from user $s$ to
user $r$. $X_{sr}(\cdot)$ does not need to be communicated any more.

Each user only has local knowledge. At time $\tau$, user $i$ has
knowledge of the source realization $x_{ij}(t)$, $-\infty < t \le \tau - 1$,
$1 \le j \le N$, the modem $h_i$, the medium input realization
$\iota_i(t)$, $-\infty < t \le \tau - 1$, the medium output realization
$o_i(t)$, $-\infty < t \le \tau - 1$, the realization of the reproduction
of sources from various users destined for user $i$, $y_{ji}(t)$,
$-\infty < t \le \tau - 1$, $1 \le j \le N$, and the common randomness
input $C$. User $i$ also has knowledge of any guarantees associated with
sources at user $i$, that is, sources $X_{ij}(\cdot)$, $1 \le j \le N$.
It is known to all users that the sources $X_{ij}$, $1 \le i, j \le N$,
and $X'_{sr}$ are all independent of each other.

Users do not have knowledge of the medium kernel $m_\tau$ defined
in the previous section.

System architecture can be changed only locally. That is, $h_s$
and $h_r$ can be changed in order to communicate the source
$X'_{sr}(\cdot)$. All other modems should remain the same. That is, for
$i \ne s, r$, $h_i$ should remain unchanged.

Question: when, and how, can $X'_{sr}(\cdot)$ be communicated with the
required guarantee $G'$?

The definitions of guarantees G and G' will be given later. 



Fig. 2. Given: $X_{sr}$ is communicated over the medium from user $s$ to user $r$ with guarantee $G$. Question: can $X'_{sr}$ be communicated in place of $X_{sr}$ to within a guarantee $G'$ so that the communication of the other sources is not affected?

The communication of $X'_{sr}(\cdot)$ will be accomplished in the
following way:

Since there is no knowledge of the medium kernel $m_\tau$, we
would like to maintain the input-output behavior of the medium.
If the joint distribution of the medium inputs $I_i(\cdot)$,
$1 \le i \le N$, is changed, then in the absence of knowledge of the
medium kernel it is impossible to know the evolution of the medium
outputs. In order to maintain the joint distribution of the medium
inputs, we maintain the distribution of $X_{sr}(\cdot)$. We build an
encoder $e$ which maps the source $X'_{sr}(\cdot)$ into an encoded
input whose distribution is precisely the same as that of the source
process $X_{sr}(\cdot)$. We thus simulate $X_{sr}(\cdot)$. Denote this
simulated source by $X^s_{sr}(\cdot)$. The guarantee $G$ will be satisfied
between the simulated source $X^s_{sr}(\cdot)$ and the output, which we
denote by $Y^s_{sr}(\cdot)$. We then use this output $Y^s_{sr}(\cdot)$ to
make a decoding $Y'_{sr}(\cdot)$ with the use of a decoder $d$.

This encoding procedure can be thought of as embedding
information about $X'_{sr}(\cdot)$ into $X_{sr}(\cdot)$.
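
The idea of an encoder whose output "looks like" the original source can be illustrated with a minimal sketch. It assumes, purely for illustration, an i.i.d. source law `p_x` on a binary alphabet and models the common randomness as a seed shared by encoder and decoder; the names `make_codebook` and `encode`, the seed, and all numerical values are hypothetical and not from the paper. Each codeword is drawn i.i.d. from `p_x`, so the symbols handed to the existing modem have the same marginal law as the source they replace.

```python
import random
from collections import Counter

def make_codebook(seed, num_messages, block_len, alphabet, p_x):
    """Draw every codeword i.i.d. from p_x using the shared seed,
    which stands in for the common randomness."""
    rng = random.Random(seed)
    return [rng.choices(alphabet, weights=p_x, k=block_len)
            for _ in range(num_messages)]

def encode(message, codebook):
    """Map a message index to its codeword (the simulated source block)."""
    return codebook[message]

alphabet = [0, 1]
p_x = [0.7, 0.3]   # assumed law of the original source
seed = 12345       # assumed shared seed

# Encoder and decoder build identical codebooks from the same seed.
cb_enc = make_codebook(seed, num_messages=64, block_len=2000,
                       alphabet=alphabet, p_x=p_x)
cb_dec = make_codebook(seed, num_messages=64, block_len=2000,
                       alphabet=alphabet, p_x=p_x)
same_codebook = cb_enc == cb_dec

# The empirical frequency of symbol 1 in a codeword is close to p_x[1].
word = encode(17, cb_enc)
freq_one = Counter(word)[1] / len(word)
```

The decoder side is omitted here; the sketch only shows why a shared seed lets both ends agree on the code and why the encoded input keeps the source marginal.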

Note that with this encoding-decoding procedure, we will not
be "disconnecting" the modems $h_s$ and $h_r$ from the medium.
The new modem $h'_s$ at user $s$ is the composition of $h_s$ and
$e$. The new modem $h'_r$ at user $r$ is the composition of $d$ and
$h_r$. In other words, we are building "on top of" the existing
architecture to accomplish the required communication. Note
that $e$ is a stochastic code. As we shall see later, the way we
will build the encoder-decoder $e$-$d$, there is need for
common randomness $C$ between $e$ and $d$. That is, $e$-$d$ is a
random code. See Figure 3.

By requirement, the modem $h'_i$ is the same as $h_i$ for $i \ne s, r$.

The joint distribution of the inputs to the modems $h_i$ has been
maintained. This is because $X_{ij}(\cdot)$ is unchanged for
$(i, j) \ne (s, r)$. For $(i, j) = (s, r)$, the input is now
$X^s_{sr}(\cdot)$ instead of $X_{sr}(\cdot)$. $X^s_{sr}(\cdot)$ has the
same distribution as $X_{sr}(\cdot)$. $X^s_{sr}(\cdot)$ is independent of
$X_{ij}(\cdot)$, $(i, j) \ne (s, r)$, by construction and because of the
assumption that $X'_{sr}(\cdot)$ is independent of $X_{ij}(\cdot)$.
Thus, the joint distribution at the inputs to the modems $h_i$ has been
maintained. As a result, $X_{ij}(\cdot)$ is received precisely as
$Y_{ij}(\cdot)$ for $(i, j) \ne (s, r)$. $X_{sr}(\cdot)$ is not
transmitted anymore, however. Instead, $X^s_{sr}(\cdot)$ is transmitted.

We stated before that we would like the joint medium input
and output distributions to be maintained. By maintaining the
distribution of $X_{sr}(\cdot)$, this has automatically happened.

Note: we are using this way of simulating $X_{sr}(\cdot)$ and "building
on top" of the already existing architecture in order to communicate
$X'_{sr}(\cdot)$ from user $s$ to user $r$. Other ways may exist. This
is the view and method that we use.

Fig. 3. $X'_{sr}$ is communicated from user $s$ to user $r$ with guarantee $G'$ with the help of $e$ and $d$; there is common randomness between $e$ and $d$.

The assumption of independence of sources $X_{ij}(\cdot)$ is required
in the above construction for the following reason:

Let $X_{ij}(\cdot)$ and $X_{sr}(\cdot)$, $(i, j) \ne (s, r)$, be
dependent. In order to communicate $X'_{sr}(\cdot)$, we simulate
$X_{sr}(\cdot)$ as described above. This would mean that $X_{ij}(\cdot)$
would also need to be at least partially simulated in order to respect
the joint distribution of $X_{sr}(\cdot)$ and $X_{ij}(\cdot)$. This would
in turn mean that the system behavior would change for the transmission
of $X_{ij}(\cdot)$ from user $i$ to user $j$. This is not permitted.

Modem $h'_s$ consumes the same energy as the modem $h_s$. This
is because the new medium input has the same distribution as
$I_s(\cdot)$. We are neglecting any energy consumption in the circuits
of the modem. Also, the bandwidth of the medium consumed
by the modem $h'_s$ is the same as the bandwidth consumed by
the modem $h_s$, for the same reason.

In general, consumption of all resources related to the medium
remains unchanged if we maintain the marginal of $I_s(\cdot)$.

A similar procedure can potentially be followed for communication
of other sources $X'_{ij}(\cdot)$ from a user $i$ to user $j$,
$1 \le i, j \le N$. This results in a decentralized system for
communication of various sources between various users over a network.



We will elaborate on, and see an application of, the reasoning
described in this section to prove a source-channel separation
result for rate-distortion in networks by making the source
$X'_{sr}(\cdot)$ have the same distribution as the source
$X_{sr}(\cdot)$. This section just describes the view.



5. BEHAVIORAL VIEW 



In this section, we put the ideas discussed in the previous
section in the behavioral perspective of Willems (1989).

By convention, a random variable $S$ taking values in a set
$\mathcal{S}$ has a probability distribution denoted by $p_S$.

Behavior of a stochastic system: The behavior of a stochastic
system $s$ is a set $\mathcal{B}_s \subseteq \{S : S$ is a random
variable taking values in $\mathcal{S}\}$. If, for example,
$\mathcal{S} = \mathbb{R}$, $\mathcal{B}_s$ is a subset of all random
variables on $\mathbb{R}$. If, for example,
$\mathcal{S} = \mathbb{R}^{[0,\infty)}$, $\mathcal{B}_s$ is a subset of
all stochastic processes on $\mathbb{R}^{[0,\infty)}$.

Interconnection of stochastic systems: Let $s$ be a stochastic
system with two "terminals" $t_1$ and $t_2$. The random variable
at terminal $t_1$ is $S_1$, taking values in set $\mathcal{S}_1$. The
random variable at terminal $t_2$ is $S_2$, taking values in set
$\mathcal{S}_2$. $\mathcal{B}_s \subseteq \{S_1 S_2 : S_1 S_2$ is a
random variable taking values in $\mathcal{S}_1 \times \mathcal{S}_2\}$.
Similarly, let $s'$ be a stochastic system with two "terminals"
$t'_1$ and $t'_2$. The random variable at terminal $t'_1$ is
$S'_1$, taking values in set $\mathcal{S}'_1$. The random variable at
terminal $t'_2$ is $S'_2$, taking values in set $\mathcal{S}'_2$.
$\mathcal{B}_{s'} \subseteq \{S'_1 S'_2 : S'_1 S'_2$ is a random
variable taking values in $\mathcal{S}'_1 \times \mathcal{S}'_2\}$.
Let $\mathcal{S}'_1 = \mathcal{S}_2$. The interconnection of systems
$s$ and $s'$, denoted by $v$, when terminal $t_2$ is connected to
terminal $t'_1$, is defined behaviorally as follows:
$\mathcal{B}_v = \{S_1 X S'_2 : S_1 X \in \mathcal{B}_s$ and
$X S'_2 \in \mathcal{B}_{s'}\}$, where $X$ is the random variable at
the terminal $x$ which is the interconnection of terminals $t_2$ and
$t'_1$. See Figure 4.

Fig. 4.

Primitive and non-primitive random variables: Primitive ran- 
dom variables are those which evolve autonomously. An ex- 
ample of a primitive random variable is a source which needs 
to be communicated to a destination. Non-primitive random- 
variables come out of action of systems on primitive random 
variables. An example of a non-primitive random variable is a 
source-reproduction. 

Interconnection of stochastic systems, as defined above might 
not make physical sense in certain cases. 

For example, consider the case when $S_2$ and $S'_1$ are independent
primitive random variables. The above interconnection forces
$S_2 = S'_1$. Even if $S'_1$ and $S_2$ had the same distribution, this
interconnection does not make physical sense because $S_2$ and
$S'_1$ are primitive and might not be, or evolve in a way that they
are, equal to each other. Such an interconnection might make
sense if $S'_1$ were not primitive, for example, if $S'_1$ were an
output of the system and equal to $S_2$.

Consider another example when $S_1$ and $S'_2$ are primitive and
independent. The above interconnection might cause a dependence
between the realizations of $S_1$ and $S'_2$ which might
not be consistent with them being independent. However,
if the behavior of the system $s'$ were $\mathcal{B}_{s'} = \{S'_1 S'_2 :
S'_1$ and $S'_2$ are independent$\}$, then the above interconnection
would not lead to inconsistency.

The previous section can be summarized in the behavioral view
as follows. Systems $e$ and $d$ need to be constructed. System
$e$ needs to be interconnected to $h_s$ and system $d$ needs to
be interconnected to $h_r$, as shown in Figure 3. The following
should be satisfied:

(1) The process $X'_{sr}(\cdot) X^s_{sr}(\cdot) \in \mathcal{B}_e$ is such
that $X^s_{sr}(\cdot)$ has the same distribution as $X_{sr}(\cdot)$.

(2) $Y^s_{sr}(\cdot) Y'_{sr}(\cdot) \in \mathcal{B}_d$, where
$Y^s_{sr}(t)$, $t \in (-\infty, \infty)$, has the same distribution as
$Y_{sr}(\cdot)$, and such that the guarantee $G'$ is satisfied between
the processes $X'_{sr}(\cdot)$ and $Y'_{sr}(\cdot)$.

Note that $X_{sr}(\cdot)$ is not primitive any more because it is no
longer a source: as stated in Section 4, $X'_{sr}(\cdot)$ needs to be
communicated in place of $X_{sr}(\cdot)$. Otherwise, when interconnecting
$e$ and $h_s$, we would have landed in the problem of the first
example described above. Also, the assumption of the independence of
sources $X_{ij}(\cdot)$, $1 \le i, j \le N$, made in the previous
section is needed for precisely the same reason as in the
second example above.

6. COMMUNICATION TO WITHIN A DISTORTION 
LEVEL AND ERROR 

Let $d : \mathcal{X}_{sr} \times \mathcal{Y}_{sr} \to [0, \infty)$ be a
function. $d$ is the distortion function. For $x_{sr} \in \mathcal{X}_{sr}$,
$y_{sr} \in \mathcal{Y}_{sr}$, $d(x_{sr}, y_{sr})$ is the distortion
incurred if $x_{sr}$ is decoded as $y_{sr}$.



Notation: $n$ length sequences will be denoted with superscript $n$.

Definition: Distortion between $n$ length sequences
$x^n_{sr} \in \mathcal{X}^n_{sr}$, $y^n_{sr} \in \mathcal{Y}^n_{sr}$ is
additive:

$$d^n(x^n_{sr}, y^n_{sr}) \triangleq \sum_{k=1}^{n} d(x^n_{sr}[k], y^n_{sr}[k]).$$

Average distortion is

$$\frac{1}{n} d^n(x^n_{sr}, y^n_{sr}) = \frac{1}{n} \sum_{k=1}^{n} d(x^n_{sr}[k], y^n_{sr}[k]).$$
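
The additive and average distortion definitions translate directly into code. The sketch below assumes, for illustration only, binary sequences and the Hamming single-letter distortion; the function names `block_distortion`, `average_distortion`, and `hamming` are hypothetical.

```python
def block_distortion(x, y, d):
    """Additive distortion d^n(x^n, y^n) = sum over k of d(x[k], y[k])."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(d(a, b) for a, b in zip(x, y))

def average_distortion(x, y, d):
    """Per-symbol distortion (1/n) d^n(x^n, y^n)."""
    return block_distortion(x, y, d) / len(x)

def hamming(a, b):
    """Assumed single-letter distortion: 0 if the symbols agree, 1 otherwise."""
    return 0 if a == b else 1

x = [0, 1, 1, 0, 1, 0, 0, 1]
y = [0, 1, 0, 0, 1, 1, 0, 1]
total = block_distortion(x, y, hamming)    # two positions differ
avg = average_distortion(x, y, hamming)    # 2 / 8
```

Any other non-negative single-letter function can be passed in place of `hamming`; additivity is what makes the block distortion a plain sum.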

Definition: Let the source $X_{sr}(\cdot)$ be discrete and evolve as
$X^n_{sr}(1), X^n_{sr}(2), \ldots, X^n_{sr}(n)$. $n$ is the block-length.
The source will be denoted by $X^n_{sr}$ to make explicit the fact that
the block-length is $n$. Recall from the last paragraph of Section 3
that the source is also denoted as $X^n_{sr}[1], \ldots, X^n_{sr}[n]$.
The reproduction of $X^n_{sr}[i]$ is $Y^n_{sr}[i]$. Source $X^n_{sr}$ of
block-length $n$ is said to be communicated to within a distortion level
$D$ under metric $d$ with error probability $< \epsilon$ if

$$\Pr\left(\frac{1}{n} d^n\big(X^n_{sr}[\,], Y^n_{sr}[\,]\big) > D\right) < \epsilon \tag{3}$$

Probability is taken with respect to the joint distribution of
$X^n_{sr}[\,]$ and $Y^n_{sr}[\,]$, which can be obtained by
marginalization from the joint distribution of $(X_{ij}(t), I_i(t),
O_i(t), Y_{ij}(t), 1 \le i, j \le N, -\infty < t < \infty)$.
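
To get a feel for the excess-distortion probability in (3), it can be estimated by Monte Carlo for a toy system. The sketch assumes, for illustration only, Hamming distortion and a reproduction that differs from the source in each position independently with probability `flip_prob`; the function name and all parameter values are hypothetical.

```python
import random

def excess_distortion_prob(n, flip_prob, D, trials, seed=0):
    """Monte Carlo estimate of Pr((1/n) d^n(X^n, Y^n) > D) under Hamming
    distortion, when each position is wrong independently w.p. flip_prob."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        # With Hamming distortion only the number of wrong positions matters.
        flips = sum(1 for _ in range(n) if rng.random() < flip_prob)
        if flips / n > D:
            exceed += 1
    return exceed / trials

p_short = excess_distortion_prob(n=20, flip_prob=0.1, D=0.2, trials=2000)
p_long = excess_distortion_prob(n=500, flip_prob=0.1, D=0.2, trials=2000)
```

In this toy setting the per-symbol distortion concentrates around `flip_prob`, so for a distortion level above it the estimate shrinks as the block length grows.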

Notation and definitions: The source $X'_{sr}$ takes values in the
set $\mathcal{X}'_{sr}$. The reproduction of $X'_{sr}$ is $Y'_{sr}$.
$Y'_{sr}$ belongs to the set $\mathcal{Y}'_{sr}$. Analogously to the
above, we can define a distortion metric $d'$ and talk about the
communication of block-length $n'$ source $X'^{n'}_{sr}$ to within a
distortion $D'$ under metric $d'$ with error probability $< \epsilon'$.

Notation and definition: Let $\mathcal{M}^n$ be a message set of
cardinality $2^{nR}$ for some $R$. The message $M^n$ is a random variable
which has some distribution on $\mathcal{M}^n$. Note that $M^n$ does not
necessarily have the uniform distribution. For our purpose, the
precise distribution of $M^n$ will not affect the results. We will
ask a question about communication of $M^n$ from user $s$ to user
$r$. Let $\hat{M}^n$ be some decoding of $M^n$ after transmission over
some system. Rate $R$ source $M^n$ of block-length $n$ is said to
be communicated with error probability $< \delta$ under the MBP
criterion if

$$\sup_{m^n \in \mathcal{M}^n} \Pr\big(\hat{M}^n \ne M^n \mid M^n = m^n\big) < \delta \tag{4}$$

MBP stands for Maximal Block Error Probability. In the limit
as $n \to \infty$, if $\delta \to 0$, we say that there is reliable
communication at rate $R$ under the maximal block error probability (MBP)
criterion.
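
The difference between the maximal criterion in (4) and an average error probability can be seen on a small table of per-message error probabilities. The numbers and the names `maximal_block_error` and `average_block_error` below are hypothetical and serve only as an illustration.

```python
def maximal_block_error(cond_error):
    """Supremum over messages m of Pr(decoded != m | sent = m)."""
    return max(cond_error.values())

def average_block_error(cond_error, prior):
    """Error probability averaged over a message distribution (for comparison)."""
    return sum(prior[m] * cond_error[m] for m in cond_error)

# Hypothetical per-message error probabilities for a 4-message code.
cond_error = {0: 0.01, 1: 0.02, 2: 0.20, 3: 0.01}
prior = {0: 0.4, 1: 0.4, 2: 0.1, 3: 0.1}

mbp = maximal_block_error(cond_error)        # the worst message dominates
avg = average_block_error(cond_error, prior)
```

Since the average under any message distribution is at most the maximum, a bound under the MBP criterion holds whatever the distribution of the message, which is consistent with the remark that the message need not be uniform.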

Notation: In what follows, we will sometimes denote the process
$X_{ij}(\cdot)$ by just $X_{ij}$, and similarly for other processes.



7. BASIC THEOREMS

In this section, we will prove results concerning communication
from user $s$ to user $r$. Communication does take place between
other users $i$ and $j$, $(i, j) \ne (s, r)$. We will be concerned
with communication between $(i, j) \ne (s, r)$ only in the sense that we
do not want that communication to be affected by any changes
that we make to the system for communication from user $s$
to user $r$. That is, even if we make changes to the system
architecture, $X_{ij}$ should still be received precisely as $Y_{ij}$ if
$(i, j) \ne (s, r)$. This was discussed in Section 4.



Recall the assumptions made in Section 4 that the sources
$X_{ij}(\cdot)$, $1 \le i, j \le N$, and $X'_{sr}(\cdot)$ are independent
of each other. We also assume that the random sources $X_{sr}(\cdot)$ and
$X'_{sr}(\cdot)$ are i.i.d. The results can be generalized to stationary
ergodic sources under some conditions.

Notation: Since we will be concerned only with communication
between user $s$ and user $r$, in order to simplify notation,
$\mathcal{X}_{sr}$, $\mathcal{Y}_{sr}$, $X_{sr}$, $Y_{sr}$, $X'_{sr}$,
$Y'_{sr}$, $X^s_{sr}$, and $Y^s_{sr}$ will be denoted by $\mathcal{X}$,
$\mathcal{Y}$, $X$, $Y$, $X'$, $Y'$, $X^s$, and $Y^s$ respectively.

Consider the source $X^n_{sr}$ of block-length $n$, which is denoted
in simplified notation by $X^n$. It is known that $X^n$ is communicated
to within a distortion level $D$ under metric $d$ with error
probability $< \epsilon$ from user $s$ to user $r$. This communication is
the guarantee $G$ in the language of Section 4. Consider the
source $X'^{n'}_{sr}$ of block-length $n'$, which is denoted in simplified
notation by $X'^{n'}$. We ask the question: can source $X'^{n'}$ of
block-length $n'$ be communicated to within a distortion level
$D'$ under the metric $d'$ with error probability $< \epsilon'_n$ from user
$s$ to user $r$ in place of $X^n$, in the way described in Section 4?
This communication of the source $X'^{n'}$ is the guarantee $G'$ in
the language of Section 4.

Note: We are operating in the framework of information theory. 
In particular, delays do not matter. Any decoding that needs 
to be performed can be performed after observing the whole 
output process. That is, the decoding need not be causal. 

In order to answer this question, we first answer the question of
communication of the rate $R$ source $M^n$ defined in the previous
section over the system under the MBP criterion with error
probability $< \delta_n$. Another way of saying this in the language of
Section 4 is the following: $X_{sr} = X^n$. $G$ is the communication
of source $X^n$ to within a distortion level $D$ under metric $d$ with
error probability $< \epsilon$. $X'_{sr}$ = rate $R$ source $M^n$. $G'$ is
the communication of rate $R$ source $M^n$ under the MBP criterion
with error probability $< \delta_n$.

Notation: Let $R_X(D)$ denote the rate-distortion function for
the source $X$. See Shannon (1959) for a definition. Shannon (1959)
uses an expectation condition when defining the rate-distortion
function:

$$\lim_{n \to \infty} E\left[\frac{1}{n} d^n(X^n, Y^n)\right] \le D \tag{5}$$

The definition that we use for distortion is the limit of (3) as
block-length $n \to \infty$, that is,

$$\lim_{n \to \infty} \Pr\left(\frac{1}{n} d^n(X^n, Y^n) > D\right) = 0 \tag{6}$$


These two rate-distortion functions are the same as proved in 
Agarwal et al. (2006). The dependence of the rate-distortion 
function on the distortion metric is not shown explicitly. 
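
As a concrete instance of a rate-distortion function, the textbook closed form for a Bernoulli($p$) source under Hamming distortion is $R(D) = h(p) - h(D)$ for $0 \le D < \min(p, 1-p)$ and $0$ otherwise, where $h$ is the binary entropy. This example is not part of the paper's development; the sketch below only evaluates that standard formula, and the function names are hypothetical.

```python
import math

def binary_entropy(p):
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def rate_distortion_bernoulli(p, D):
    """R(D) in bits per symbol for a Bernoulli(p) source, Hamming distortion."""
    if D < 0:
        raise ValueError("distortion level must be non-negative")
    if D >= min(p, 1 - p):
        return 0.0
    return binary_entropy(p) - binary_entropy(D)

r_lossless = rate_distortion_bernoulli(0.5, 0.0)   # one bit per symbol
r_lossy = rate_distortion_bernoulli(0.5, 0.11)     # about half a bit
r_trivial = rate_distortion_bernoulli(0.5, 0.5)    # nothing needs to be sent
```

Tolerating roughly 11 percent bit errors on a fair binary source about halves the rate that has to be carried reliably.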

Notation: The rate-distortion function for source $X'$ with
distortion $D'$ will be denoted by $R_{X'}(D')$.

Theorem 1. Given a system where i.i.d. source $X^n$ is communicated
to within a distortion $D$ under metric $d$ with error
probability $< \epsilon$. Let $R = R_X(D) - \alpha$ for some $\alpha > 0$.
Then, rate $R$ source $M^n$ (where $M^n$ is arbitrary) can be
communicated under the MBP criterion with error probability $< \delta_n$
from user $s$ to user $r$, in place of communicating source $X^n$ by using
the method described in Section 4, for some $\delta_n \to \epsilon$ as
$n \to \infty$.

Proof. This follows from Theorem 1 in Agarwal et al. (2006).
Note that the codes are generated i.i.d. $X$ and hence, the
distribution of $X^n$ is maintained as required in Section 4. Also
note that Theorem 1 in Agarwal et al. (2006) is universal: the
channel might be unknown. Thus, for this theorem, the medium
might be unknown, as required in Section 4.

We use the above theorem to prove the result concerning
communication of source $X'$ of block length $n'$ to within a
distortion $D'$ under metric $d'$ with error probability $< \epsilon'_n$
from user $s$ to user $r$ in place of the i.i.d. $X$ source of block
length $n$, which is known to be communicated to within a distortion $D$
under metric $d$ with error probability $< \epsilon$ from user $s$ to
user $r$.

Theorem 2. Given that i.i.d. $X$ source of block length $n$ is
communicated from user $s$ to user $r$ to within a distortion $D$ under
metric $d$ with error probability $< \epsilon$. Let

$$\frac{n}{n'} > \frac{R_{X'}(D')}{R_X(D)} + \psi$$

for some $\psi > 0$. Then i.i.d. $X'$ source of block length $n'$ can be
communicated from user $s$ to user $r$ to within a distortion $D'$
under metric $d'$ with error probability $< \epsilon'_n$ for some
$\epsilon'_n \to \epsilon$ as $n \to \infty$, in place of the i.i.d. $X$
source.

Proof. This uses the usual argument of source-coding followed by reliable channel coding. Roughly, the argument is the following. Compress the source $X'$ to within the distortion level $D'$. The output is a message set of cardinality $2^{n' R_{X'}(D')}$. Communicate the compressed message over the system from user $s$ to user $r$. The message gets communicated correctly with probability $1 - \epsilon$. This communication with probability $1 - \epsilon$ can be accomplished because the conditions of the previous theorem, Theorem 1, are satisfied. Finally, decode the source. End to end, the required communication of i.i.d. $X'$ source is accomplished.

More precisely, there exist source encoders $s_e^{n'} : \mathcal{X}'^{n'} \to \mathcal{M}^{n'} = \{1, 2, \ldots, 2^{n'(R_{X'}(D') + \psi R_X(D)/2)}\}$ and source decoders $s_d^{n'} : \mathcal{M}^{n'} \to \mathcal{Y}'^{n'}$ such that

$$\Pr\left(\frac{1}{n'} d'^{n'}\left(X'^{n'}, s_d^{n'} \circ s_e^{n'}(X'^{n'})\right) > D'\right) = \eta_{n'} \to 0 \text{ as } n \to \infty \tag{7}$$

By an assumption in the theorem, it follows that

$$n R_X(D) > n' \left(R_{X'}(D') + \psi R_X(D)\right) \tag{8}$$

Define

$$\alpha = \frac{\psi R_X(D)^2}{2 \left(R_{X'}(D') + \psi R_X(D)\right)}$$

It follows that

$$n \left(R_X(D) - \alpha\right) > n' \left(R_{X'}(D') + \frac{\psi R_X(D)}{2}\right) \tag{9}$$

We can, thus, think of the maps $s_e^{n'}$ and $s_d^{n'}$ as source encoder $s_e^{n'} : \mathcal{X}'^{n'} \to \mathcal{M}^n = \{1, 2, \ldots, 2^{n(R_X(D) - \alpha)}\}$ and source decoder $s_d^{n'} : \mathcal{M}^n \to \mathcal{Y}'^{n'}$. This is because, by (9), $\mathcal{M}^{n'} \subset \mathcal{M}^n$. We can then re-label, and call $s_e^{n'}$ as $s_e^n$, and call $s_d^{n'}$ as $s_d^n$.

First compress the source $X'^{n'}$ using the source encoder $s_e^n$. The output $M^n = s_e^n(X'^{n'})$ has some distribution on $\mathcal{M}^n$. By Theorem 1, it follows that there exist encoder $c_e^n$ and decoder $c_d^n$ such that, with these encoder and decoder, $M^n$ of rate $R_X(D) - \alpha$ is communicated under the MBP criterion with error $< \delta_n$ from user $s$ to user $r$, where $\delta_n \to \epsilon$ as $n \to \infty$. The decoding of $M^n$ at user $r$ is $\hat{M}^n$. Now apply the source decoder $s_d^n$ to $\hat{M}^n$. We get a decoding $Y'^{n'}$ of source $X'^{n'}$. End to end,

$$\Pr\left(\frac{1}{n'} d'^{n'}(X'^{n'}, Y'^{n'}) > D'\right) < \delta_n + \eta_{n'} = \epsilon'_n \to \epsilon \text{ as } n \to \infty \tag{10}$$

This proves the theorem. 
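The rate bookkeeping in this proof can be checked numerically. The sketch below is illustrative only and rests on assumptions not made in the paper: both sources are taken to be Bernoulli with Hamming distortion, for which the rate-distortion function is $h(p) - h(D)$, and $\psi$ is set to half of the available gap $n/n' - R_{X'}(D')/R_X(D)$ so that the strict inequality of Theorem 2 holds. The value of `alpha` is one choice that makes the channel-side message budget exceed the source-code size; the function names are hypothetical.

```python
from math import log2


def binary_entropy(p):
    """Binary entropy h(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def bernoulli_rate_distortion(p, D):
    """R(D) = h(p) - h(D) for a Bernoulli(p) source under Hamming distortion."""
    if D >= min(p, 1 - p):
        return 0.0
    return binary_entropy(p) - binary_entropy(D)


def separation_margins(n, n_prime, R, R_prime):
    """Return (psi, alpha, channel_bits, source_bits) for the two block lengths.

    channel_bits = n * (R - alpha) is the message budget granted by Theorem 1;
    source_bits = n' * (R' + psi * R / 2) is the size of the source code.
    """
    gap = n / n_prime - R_prime / R
    if gap <= 0:
        raise ValueError("block-length ratio too small for the rate condition")
    psi = gap / 2  # assumption: leave half the gap as slack
    alpha = psi * R ** 2 / (2 * (R_prime + psi * R))
    channel_bits = n * (R - alpha)
    source_bits = n_prime * (R_prime + psi * R / 2)
    return psi, alpha, channel_bits, source_bits


R = bernoulli_rate_distortion(0.5, 0.1)        # rate of the original source X at D
R_prime = bernoulli_rate_distortion(0.5, 0.2)  # rate of the new source X' at D'
psi, alpha, channel_bits, source_bits = separation_margins(1000, 1000, R, R_prime)
print(psi, alpha, channel_bits, source_bits)
```

Whenever the block-length ratio satisfies the condition of Theorem 2, `channel_bits` exceeds `source_bits`, so the compressed description of $X'$ fits inside the message set that can be communicated in place of $X$.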

Now, we specialize this theorem to the case when X' has the 
same distribution as X and is independent of X. 

Theorem 3. Given that i.i.d. $X$ source of block length $n$ is communicated over the system to within a distortion $D$ under metric $d$ with error probability $< \epsilon$ from user $s$ to user $r$. Let $D' > D$ and $R_X(D') < R_X(D)$ (note: strictly less). Then, i.i.d. $X$ source of block length $n$ can be communicated over the system to within a distortion $D'$ under metric $d$ with error probability $< \epsilon'_n$ from user $s$ to user $r$ by using an architecture which consists of source compression of $X$ followed by communication of the compressed source under the MBP criterion with some error probability. By use of this new architecture, end-to-end, the i.i.d. $X$ source is communicated to within a distortion level $D'$ under the metric $d$ with error $\epsilon'_n \to \epsilon$ as $n \to \infty$. The communication of sources from user $i$ to user $j$, $(i, j) \neq (s, r)$, is not affected by the new architecture. That is, for $(i, j) \neq (s, r)$, if $X_{ij}$ is received as $Y_{ij}$ in the given architecture, it is received precisely as $Y_{ij}$ in the new architecture also. The energy and bandwidth consumption in the two architectures is the same.

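Proof. This can be proved by use of the previous theorem, Theorem 2, with $X' = X$, $n' = n$ and $D' = D'$ as follows.

$$\frac{n}{n'} = \frac{n}{n} = 1 > \frac{R_X(D')}{R_X(D)} + \frac{1}{2} \cdot \frac{R_X(D) - R_X(D')}{R_X(D)}$$

Theorem 2 applies with $\psi = \frac{1}{2} \cdot \frac{R_X(D) - R_X(D')}{R_X(D)}$.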

There exist $s_e^n$, $s_d^n$, $c_e^n$, $c_d^n$ as in the previous theorem. The new architecture consists of modem $h'_s = h_s \circ c_e^n \circ s_e^n$ at user $s$ and modem $h'_r = s_d^n \circ c_d^n \circ h_r$ at user $r$. The required communication of source $X^n$ from user $s$ to user $r$ in the new architecture occurs by using modems $h'_s$ and $h'_r$ at users $s$ and $r$. Modems for the rest of the users remain unchanged. $h'_s$ can be interpreted as follows. First, source $X^n$ is compressed using $s_e^n$. The compressed source $M^n$ is encoded by use of $h_s \circ c_e^n$ (so that it is communicated with maximal block error probability $< \epsilon$). $h'_r$ can be interpreted as follows. First, the received sequence $Y_s'^{\,n}$ is decoded into $\hat{M}^n$ by use of $c_d^n \circ h_r$ ($\hat{M}^n$ is the estimate of $M^n$ with maximal block error probability $< \epsilon$). Then, $\hat{M}^n$ is source-decoded using $s_d^n$. End-to-end, the source $X^n$ is communicated to within distortion level $D'$ under the metric $d$ with error probability $< \epsilon'_n$ from user $s$ to user $r$ such that $\epsilon'_n \to \epsilon$ as $n \to \infty$.

The rest of the statements in the theorem follow from the 
discussion in Section 4. This completes the proof. 
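The layering used in this proof, where the new transmitter applies source coding, then channel coding, then the existing modem, and the new receiver undoes these in reverse order, can be sketched as function composition. Everything in the sketch is a toy stand-in chosen for illustration and is not the construction of the paper: the source code drops the least significant bit of a 3-bit sample, the channel code is a 3-fold repetition code, the existing modems map bits to signs, and the medium flips one symbol per repetition block.

```python
def compose(*stages):
    """Return a function that applies the given stages from left to right."""
    def run(x):
        for stage in stages:
            x = stage(x)
        return x
    return run


def source_encode(samples):
    """Toy lossy source code: keep the top 2 bits of each sample in 0..7."""
    bits = []
    for value in samples:
        index = value >> 1
        bits += [(index >> 1) & 1, index & 1]
    return bits


def source_decode(bits):
    """Reconstruct each sample as 2 * index, so the error is at most 1."""
    return [((bits[i] << 1) | bits[i + 1]) << 1 for i in range(0, len(bits), 2)]


def channel_encode(bits):
    """Toy channel code: repeat every bit three times."""
    return [b for b in bits for _ in range(3)]


def channel_decode(bits):
    """Majority vote over each block of three received bits."""
    return [1 if sum(bits[i:i + 3]) >= 2 else 0 for i in range(0, len(bits), 3)]


def modem_tx(bits):
    """Stand-in for the existing modem at the sender: bits to +/-1 symbols."""
    return [1.0 if b else -1.0 for b in bits]


def modem_rx(signal):
    """Stand-in for the existing modem at the receiver: sign detection."""
    return [1 if x > 0 else 0 for x in signal]


def medium(signal):
    """Toy medium: flips the first symbol of every block of three."""
    return [-x if i % 3 == 0 else x for i, x in enumerate(signal)]


# New sender: source code, then channel code, then the existing modem.
new_tx = compose(source_encode, channel_encode, modem_tx)
# New receiver: existing modem, then channel decoding, then source decoding.
new_rx = compose(modem_rx, channel_decode, source_decode)

samples = [0, 3, 5, 7, 2, 6]
received = new_rx(medium(new_tx(samples)))
max_error = max(abs(a - b) for a, b in zip(samples, received))
print(received, max_error)
```

The existing modems are reused unchanged and only the stages wrapped around them are new, which mirrors the "building on top" structure of the argument.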

Note: The total time incurred in the end-to-end communication 
of i.i.d. X source (the delay) might be larger in the separation 
architecture as compared to the original architecture. However, 
this does not concern us. 

Note: We have provided a separation architecture for communication of source $X_{sr}$ to within a distortion level $D$ under metric $d$, but it is built on top of the existing architecture which already accomplishes precisely the same communication! This is just a proof technique. The proof follows a route of "building on top" of the existing architecture. This helps to prove that there is no loss of optimality in using separation architectures. In practice one can use other separation architectures which are not "building on top" of the existing architecture.

8. APPLICATION TO INFORMATION THEORY: 
SEPARATION FOR RATE-DISTORTION IN NETWORKS 

In this section, we prove a source-channel separation theorem for rate-distortion for networks. Section 9 contains a discussion with examples.

8.1 Universal source-channel separation for rate-distortion in 
networks 

Information theory is concerned with the behavior of quantities in the limit as the block-length $n \to \infty$. As stated before, delays do not matter.

We first consider the question of communication of i.i.d. $X$ source $X^n$ from user $s$ to user $r$ to within a distortion level $D$ as block-length $n \to \infty$. The modem at user $i$ is $h_i^n$, $1 \le i \le N$, when the block-length is $n$. We make statements concerning optimal architectures for this communication. It is required, as stated in Section 4, that the communication between other users is not affected.

Definition: When the block-length is $n$, modem $h_i^n$ is used at user $i$, $1 \le i \le N$. The input that needs to be communicated from user $s$ to user $r$ is $X^n$. The reproduction of $X^n$ at user $r$ is $Y^n$. We say that i.i.d. $X$ source is communicated to within a distortion level $D$ over the system from user $s$ to user $r$ if (6) holds.

Theorem 4. Let there exist modems $h_i^n$, $1 \le i \le N$, $1 \le n < \infty$ such that i.i.d. $X$ source is communicated from user $s$ to user $r$ to within a distortion level $D$. Let $D' > D$ be such that $R_X(D') < R_X(D)$. Then, there exist modems $h_i'^n$, $1 \le i \le N$, $1 \le n < \infty$ such that modem $h_i'^n$ at user $i$ satisfies the following:

(1) $h_s'^n$ first source-codes i.i.d. $X$ source $X^n$ of block length $n$ and this is followed by reliable communication of the resulting message to user $r$.

(2) $h_r'^n$ does channel decoding followed by source decoding to get a decoding $Y^n$ of $X^n$.

(3) $h_i'^n$ consumes the same energy and bandwidth as $h_i^n$ for all $i$, for all $n$.

(4) communication of sources between other users is not affected in the sense defined in Section 4: $X_{ij}$ is received precisely as $Y_{ij}$ for $(i, j) \neq (s, r)$ even if modems $h_i'^n$ are used instead of $h_i^n$, $1 \le i \le N$.

Proof. This follows immediately from Theorem 3. 

Now, we prove a network version of the above theorem: com- 
munication to particular distortion levels is desired between 
various users, not just from user s to user r. 

Notation and definitions: Let $A \subset \{(i, j) \mid 1 \le i, j \le N, i \neq j\}$. Let $(p, q) \in A$. Let $d_{pq} : \mathcal{X}_{pq} \times \mathcal{Y}_{pq} \to [0, \infty)$ be a distortion metric as in Section 6. $d_{pq}^n$ is the additive average distortion, defined in the same way as $d_{sr}^n$ is defined in Section 6. Communication of source $X_{pq}$ to within a distortion level $D_{pq}$ under distortion metric $d_{pq}$ is defined analogously to (6).

Theorem 5. Let there exist modems $h_i^n$, $1 \le i \le N$, $1 \le n < \infty$ such that for all $(p, q) \in A$, i.i.d. source $X_{pq}$ is communicated from user $p$ to user $q$ to within a distortion level $D_{pq}$. Let $D'_{pq} > D_{pq}$ be such that $R_{X_{pq}}(D'_{pq}) < R_{X_{pq}}(D_{pq})$ for all $(p, q) \in A$. Then, there exist modems $h_i'^n$, $1 \le i \le N$, $1 \le n < \infty$ such that modems $h_i'^n$ at user $i$ satisfy the following:

(1) $h_p'^n$ first source-codes i.i.d. $X_{pq}$ source $X_{pq}^n$ of block length $n$ and this is followed by reliable communication of the resulting message to user $q$.

(2) $h_q'^n$ does channel decoding followed by source decoding to get a decoding $Y_{pq}^n$ of $X_{pq}^n$.

(3) $h_i'^n$ consumes the same energy and bandwidth as $h_i^n$ for all $i$, for all $n$.

(4) communication of sources between other users is not affected in the sense defined in Section 4.

Proof. This can be done step by step. First carry out the separation procedure for one user pair $(p_1, q_1)$ in $A$. This can be done by the previous theorem, Theorem 4. After making this change of architecture, source $X_{p_1 q_1}$ is still being communicated to within a distortion level $D_{p_1 q_1}$ from user $p_1$ to user $q_1$. Very important is the fact that sources $X_{ij}$, $(i, j) \neq (p_1, q_1)$, are still being received as $Y_{ij}$. In particular, for $(p, q) \in A \setminus \{(p_1, q_1)\}$, $X_{pq}$ is still communicated to within a distortion level $D_{pq}$ over the system. Now choose another user pair $(p_2, q_2) \in A \setminus \{(p_1, q_1)\}$ and repeat the procedure until all user pairs in $A$ are exhausted. This completes the proof.
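The step-by-step argument is an iteration over the user pairs in $A$ with an invariant: converting one pair leaves every other pair exactly as it was. The sketch below only illustrates that bookkeeping. Representing an architecture as a dictionary from user pairs to the labels `"original"` and `"separated"` is an assumption made for illustration, and the function names are hypothetical.

```python
def separate_pair(architecture, pair):
    """Return a copy of the architecture in which only `pair` is converted."""
    updated = dict(architecture)
    updated[pair] = "separated"
    return updated


def separate_all(architecture, pairs_to_separate):
    """Convert the pairs one at a time, checking the invariant at each step."""
    for pair in sorted(pairs_to_separate):
        before = architecture
        architecture = separate_pair(architecture, pair)
        # Invariant from the proof: all other pairs are left untouched.
        assert all(
            architecture[other] == before[other]
            for other in architecture
            if other != pair
        )
    return architecture


N = 3
all_pairs = {(i, j) for i in range(1, N + 1) for j in range(1, N + 1) if i != j}
A = {(1, 2), (3, 1)}  # pairs whose sources need a distortion guarantee
original = {pair: "original" for pair in all_pairs}
final = separate_all(original, A)
print(sorted(final.items()))
```

After the loop, exactly the pairs in `A` use the separation architecture and all remaining pairs are unchanged.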

A high-level version of this theorem is the source-channel 
separation theorem for rate-distortion for networks when the 
sources that various users want to communicate to each other 
are independent of each other. 

Theorem 6. Consider a medium $m$ and $N$ users. $N$ might change with time. Independent sources $X_{ij}$ are communicated from user $i$ to user $j$, $1 \le i, j \le N$, $i \neq j$, over the medium. $X_{ij}$ is transmitted at user $i$ and received at user $j$. Let $A$ be a subset of user pairs, that is, $A \subset \{(i, j) \mid 1 \le i, j \le N, i \neq j\}$. For $(p, q) \in A$, it is known that $X_{pq}$ is i.i.d. It is required to communicate sources $X_{pq}$, $(p, q) \in A$, to within a distortion level $D_{pq}$ over the system under a distortion metric $d_{pq}$. In order to accomplish this communication, it is sufficient to consider separation architectures: that is, architectures which compress i.i.d. source $X_{pq}$, $(p, q) \in A$, to within the desired distortion level and then communicate the compressed message reliably over the system. Communication of other sources is not affected in the separation architecture in the sense that if $(i, j) \notin A$ and $X_{ij}$ is received as $Y_{ij}$ in the original architecture, then $X_{ij}$ is received precisely as $Y_{ij}$ in the separation architecture too. Of course, $X_{pq}$, $(p, q) \in A$, is not necessarily received as $Y_{pq}$ in the separation architecture. However, it is received as some $Y'_{pq}$ which is to within a distortion $D_{pq}$ of $X_{pq}$. Energy and bandwidth consumption remains the same at each user. Delay incurred for communication of sources $X_{ij}$, $(i, j) \notin A$, remains the same.

9. DISCUSSION AND CONCLUSION

We have proved a source-channel separation theorem for rate-distortion in the network setting when the sources that various users wish to communicate with each other are independent of each other. Note that the medium is unknown. Assuming that random-coding is permitted, for every encoding-decoding scheme which achieves the required distortion bounds over the medium, we have demonstrated the existence of a separation based scheme which has the same performance as the original scheme, and this does not require the knowledge of the medium. What the result says, then, is that for the problem of rate-distortion communication over an unknown medium, it is sufficient to restrict attention to separation based protocols.

For example, consider the case of the internet. Different users wish to communicate various sources to each other. Different sources have different distortion requirements. For example, one user might want to communicate an e-mail to another user, for which no distortion is allowed. Another user might be chatting via voice or via video with another user, and in that case, distortion is permitted. The distortion metric in the case of voice and video is not additive, but for the sake of the argument, suppose that it is. The structure of the internet is unknown. In fact, it changes with time. We still need to design a protocol to meet the desired communication requirements. What we prove is that if random-coding is permitted and the sources that different users want to communicate to other users are independent of each other, it is sufficient to restrict attention to separation based protocols.

Another example is wireless communication. The wireless medium is time varying and unknown. Users want to communicate voice, which allows distortion. For the sake of the argument, assume that the distortion metric for voice is additive. There exist various protocols for wireless communication, for example, CDMA and GSM. It is a reasonable assumption that what different users say is independent of each other. We prove that, assuming that random-coding is permitted, one does not lose anything by restricting attention to separation-based protocols for the question of the number of users which can be communicating over the wireless medium at a particular time.

The above problem of communicating sources with a fidelity criterion when the sources are not independent is open in general. Source-channel separation based architectures are not optimal in general.



